Semi-supervised Segmentation Fusion of Multi-spectral and Aerial Images
A Semi-supervised Segmentation Fusion algorithm is proposed using consensus
and distributed learning. The aim of Unsupervised Segmentation Fusion (USF) is
to achieve a consensus among different segmentation outputs obtained from
different segmentation algorithms by computing an approximate solution to the
NP problem with less computational complexity. Semi-supervision is incorporated
in USF using a new algorithm called Semi-supervised Segmentation Fusion (SSSF).
In SSSF, side information about the co-occurrence of pixels in the same or
different segments is formulated as the constraints of a convex optimization
problem. The results of experiments conducted on artificial and real-world
benchmark multi-spectral and aerial images show that the proposed algorithms
perform better than the individual state-of-the-art segmentation algorithms.
Comment: A version of the manuscript was published in ICPR 201
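The consensus idea in the abstract above can be illustrated with a co-association matrix, a common device in consensus clustering. This is only a sketch under that assumption: the abstract does not say that USF or SSSF uses this construction, and the function name `co_association` is hypothetical.

```python
def co_association(segmentations):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    segmentations in which pixels i and j carry the same label.

    `segmentations` is a list of label lists, one per segmentation
    algorithm, all of the same length n (pixels flattened to 1-D).
    """
    n = len(segmentations[0])
    m = len(segmentations)
    counts = [[0] * n for _ in range(n)]
    for labels in segmentations:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # Divide once at the end so the fractions are computed from exact counts.
    return [[c / m for c in row] for row in counts]


# Two hypothetical segmentations of a four-pixel image.
segs = [[0, 0, 1, 1],
        [0, 1, 1, 1]]
agreement = co_association(segs)
```

Entries near 1 mark pixel pairs that most segmentations place together, and entries near 0 mark pairs they separate. Side information of the kind described for SSSF could then be read as constraints on individual entries.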
Fine-grained Optimization of Deep Neural Networks
In recent studies, several asymptotic upper bounds on generalization errors
of deep neural networks (DNNs) have been theoretically derived. These bounds are
functions of several norms of weights of the DNNs, such as the Frobenius and
spectral norms, and they are computed for weights grouped according to either
input or output channels of the DNNs. In this work, we conjecture that if we
can impose multiple constraints on weights of DNNs to upper bound the norms of
the weights, and train the DNNs with these weights, then we can attain
empirical generalization errors closer to the derived theoretical bounds, and
improve accuracy of the DNNs.
To this end, we pose two problems. First, we aim to obtain weights whose
different norms are all upper bounded by a constant number, e.g. 1.0. To
achieve these bounds, we propose a two-stage renormalization procedure: (i)
normalization of weights according to different norms used in the bounds, and
(ii) reparameterization of the normalized weights to set a constant and finite
upper bound of their norms. In the second problem, we consider training DNNs
with these renormalized weights. To this end, we first propose a strategy to
construct joint spaces (manifolds) of weights according to different
constraints in DNNs. Next, we propose a fine-grained SGD algorithm (FG-SGD) for
optimization on the weight manifolds to train DNNs with assurance of
convergence to minima. Experimental results show that image classification
accuracy of baseline DNNs can be boosted using FG-SGD on collections of
manifolds identified by multiple constraints.
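As a rough illustration of bounding weight norms by a constant, the sketch below rescales a weight matrix so that its Frobenius norm is at most a given bound. Because the spectral norm never exceeds the Frobenius norm, the same rescaling also bounds the spectral norm. This is a generic norm-ball projection, not the paper's two-stage renormalization or FG-SGD, and the names `frobenius_norm` and `clip_to_norm_ball` are hypothetical.

```python
import math


def frobenius_norm(w):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in w for x in row))


def clip_to_norm_ball(w, bound=1.0):
    """Rescale w so that its Frobenius norm is at most `bound`.

    Matrices that already satisfy the bound are returned as a copy.
    """
    norm = frobenius_norm(w)
    if norm <= bound:
        return [row[:] for row in w]
    scale = bound / norm
    return [[x * scale for x in row] for row in w]


# A 2 x 2 example whose Frobenius norm is 5 before rescaling.
weights = [[3.0, 0.0], [0.0, 4.0]]
clipped = clip_to_norm_ball(weights, bound=1.0)
```

In a training loop, a step like this would typically be applied after each gradient update, which gives projected SGD on the norm ball.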
A New Hole Density as a Stability Measure for Boron Fullerenes
We investigate the stability of boron fullerene sets B76, B78 and B82. We
evaluate the ground state energies, nucleus-independent chemical shift (NICS),
the binding energies per atom and the band gap values by means of
first-principles methods. We construct our fullerene design by capping
pentagons and hexagons of the B60 cage in such a way that the total number of atoms
is preserved. In doing so, a new hole density definition is proposed such that
each member of a fullerene group has a different hole density which depends on
the capping process. Our analysis reveals that each boron fullerene set has its
lowest-energy configuration around the same normalized hole density and the
most stable cages are found in the fullerene groups which have a relatively large
difference between the maximum and the minimum hole densities. The result is a
new stability measure relating the cage geometry characterized by the hole
density to the relative energy.
Comment: 6 pages, 3 figures, 2 tables
Linear Discriminant Generative Adversarial Networks
We develop a novel method for training of GANs for unsupervised and class
conditional generation of images, called Linear Discriminant GAN (LD-GAN). The
discriminator of an LD-GAN is trained to maximize the linear separability
between distributions of hidden representations of generated and targeted
samples, while the generator is updated based on the decision hyper-planes
computed by performing LDA over the hidden representations. LD-GAN provides a
concrete metric of separation capacity for the discriminator, and we
experimentally show that it is possible to stabilize the training of LD-GAN
simply by calibrating the update frequencies between generators and
discriminators in the unsupervised case, without employment of normalization
methods and constraints on weights. In the class conditional generation tasks,
the proposed method shows improved training stability together with better
generalization performance compared to WGAN that employs an auxiliary
classifier.
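The notion of linear separability between two sets of hidden representations can be made concrete with the classical two-class Fisher criterion. The sketch below works on one-dimensional features for brevity. It is an assumption that this criterion resembles the separation metric used by LD-GAN; the abstract only states that LDA is performed over the hidden representations, and `fisher_criterion` is a hypothetical name.

```python
def fisher_criterion(a, b):
    """Two-class Fisher criterion for 1-D samples:
    squared distance between class means divided by the sum of the
    within-class variances. Larger values mean better separation.
    Raises ZeroDivisionError if both classes have zero variance.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / len(a)
    var_b = sum((x - mean_b) ** 2 for x in b) / len(b)
    return (mean_a - mean_b) ** 2 / (var_a + var_b)


# Hypothetical 1-D features of generated and target samples.
generated = [0.0, 2.0]
target_far = [4.0, 6.0]
target_near = [2.0, 4.0]
score_far = fisher_criterion(generated, target_far)
score_near = fisher_criterion(generated, target_near)
```

A discriminator that maximizes such a score pushes the two feature distributions apart, while a generator would aim to make them harder to separate.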
Design of Kernels in Convolutional Neural Networks for Image Classification
Despite the effectiveness of Convolutional Neural Networks (CNNs) for image
classification, our understanding of the relationship between shape of
convolution kernels and learned representations is limited. In this work, we
explore and employ the relationship between shape of kernels which define
Receptive Fields (RFs) in CNNs for learning of feature representations and
image classification. For this purpose, we first propose a feature
visualization method for visualization of pixel-wise classification score maps
of learned features. Motivated by our experimental results, and observations
reported in the literature for modeling of visual systems, we propose a novel
design of shape of kernels for learning of representations in CNNs. In the
experimental results, we achieved a state-of-the-art classification performance
compared to a base CNN model [28] by reducing the number of parameters and
computational time of the model using the ILSVRC-2012 dataset [24]. The
proposed models also outperform the state-of-the-art models employed on the
CIFAR-10/100 datasets [12] for image classification. Additionally, we analyzed
the robustness of the proposed method to occlusion for classification of
partially occluded images compared with the state-of-the-art methods. Our
results indicate the effectiveness of the proposed approach. The code is
available at github.com/minogame/caffe-qhconv
Improving Robustness of Feature Representations to Image Deformations using Powered Convolution in CNNs
In this work, we address the problem of improving the robustness of feature
representations learned using convolutional neural networks (CNNs) to image
deformation. We argue that higher moment statistics of feature distributions
could be shifted due to image deformations, and that the shift leads to a degradation of
performance and cannot be reduced by ordinary normalization methods, as observed
in experimental analyses. In order to attenuate this effect, we apply
additional non-linearity in CNNs by combining power functions with learnable
parameters into the convolution operation. In the experiments, we observe that CNNs
which employ the proposed method obtain a remarkable boost in both the
generalization performance and the robustness under various types of
deformations using large scale benchmark datasets. For instance, a model
equipped with the proposed method obtains a 3.3% performance boost in mAP on the
PASCAL VOC object detection task using deformed images, compared to the
reference model, while both models provide the same performance using original
images. To the best of our knowledge, this is the first work that studies
robustness of deep features learned using CNNs to a wide range of deformations
for object recognition and detection.
HyperNetworks with statistical filtering for defending adversarial examples
Deep learning algorithms have been known to be vulnerable to adversarial
perturbations in various tasks such as image classification. This problem was
addressed by employing several defense methods for detection and rejection of
particular types of attacks. However, training and manipulating networks
according to particular defense schemes increases computational complexity of
the learning algorithms. In this work, we propose a simple yet effective method
to improve robustness of convolutional neural networks (CNNs) to adversarial
attacks by using data dependent adaptive convolution kernels. To this end, we
propose a new type of HyperNetwork in order to employ statistical properties of
input data and features for computation of statistical adaptive maps. Then, we
filter convolution weights of CNNs with the learned statistical maps to compute
dynamic kernels. Thereby, weights and kernels are collectively optimized for
learning of image classification models robust to adversarial attacks without
employment of additional target detection and rejection algorithms. We
empirically demonstrate that the proposed method enables CNNs to spontaneously
defend against different types of attacks, e.g. attacks generated by Gaussian
noise, fast gradient sign methods (Goodfellow et al., 2014) and a black-box
attack (Narodytska & Kasiviswanathan, 2016).
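For readers unfamiliar with the fast gradient sign method mentioned above, the sketch below applies it to a tiny logistic-regression classifier, where the gradient of the cross-entropy loss with respect to the input has the closed form (p - y) * w. The model, the function names, and the numbers are illustrative assumptions and have nothing to do with the networks evaluated in the paper.

```python
import math


def logit(x, w, b):
    """Linear score w . x + b of a logistic-regression model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm_logistic(x, y, w, b, eps):
    """One fast-gradient-sign step: x + eps * sign(dLoss/dx).

    For logistic regression with cross-entropy loss and label y in {0, 1},
    dLoss/dx = (p - y) * w, where p = sigmoid(w . x + b).
    """
    p = 1.0 / (1.0 + math.exp(-logit(x, w, b)))
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical input that the model classifies confidently as class 1.
x = [1.0, -1.0]
w = [2.0, -1.0]
x_adv = fgsm_logistic(x, y=1, w=w, b=0.0, eps=0.1)
```

Each input coordinate moves by exactly `eps` in the direction that increases the loss, so the score for the true class drops even though the perturbation is small.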
Deep Structured Energy-Based Image Inpainting
In this paper, we propose a structured image inpainting method employing an
energy based model. In order to learn structural relationship between patterns
observed in images and missing regions of the images, we employ an energy-based
structured prediction method. The structural relationship is learned by
minimizing an energy function which is defined by a simple convolutional neural
network. The experimental results on various benchmark datasets show that our
proposed method significantly outperforms the state-of-the-art methods which
use Generative Adversarial Networks (GANs). We obtained 497.35 mean squared
error (MSE) on the Olivetti face dataset compared to 833.0 MSE provided by the
state-of-the-art method. Moreover, we obtained 28.4 dB peak signal to noise
ratio (PSNR) on the SVHN dataset and 23.53 dB on the CelebA dataset, compared
to 22.3 dB and 21.3 dB, provided by the state-of-the-art methods, respectively.
The code is publicly available.
Comment: Accepted to 24th International Conference on Pattern Recognition
(ICPR 2018). 6 pages, 7 figures
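The two metrics quoted in the abstract above are related by the standard definition PSNR = 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value. The sketch below implements that definition. The default of 255 assumes 8-bit images, which is an assumption, since the abstract does not state the pixel range used in the experiments.

```python
import math


def psnr(mse, max_value=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error.

    `max_value` is the largest possible pixel value (255 for 8-bit images).
    """
    if mse <= 0:
        raise ValueError("PSNR is undefined for a non-positive MSE")
    return 10.0 * math.log10(max_value ** 2 / mse)


# An MSE equal to one hundredth of MAX^2 corresponds to 20 dB.
example_db = psnr(650.25)
```

Because the relation is logarithmic, halving the MSE raises the PSNR by about 3 dB regardless of the starting point.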
Improving Head Pose Estimation with a Combined Loss and Bounding Box Margin Adjustment
We address the problem of estimating the pose of a person's head from an RGB
image. The employment of CNNs for the problem has contributed to significant
improvement in accuracy in recent works. However, we show that the following
two methods, despite their simplicity, can attain further improvement: (i)
proper adjustment of the margin of bounding box of a detected face, and (ii)
choice of loss functions. We show that the integration of these two methods
achieves the new state-of-the-art on standard benchmark datasets for in-the-wild
head pose estimation.
Comment: IEEE International Conference on Automatic Face & Gesture Recognition
(FG2019)
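Adjusting the margin of a detected face box, as in item (i) of the abstract above, can be sketched as follows. Expressing the margin as a fraction of the box width and height, and clipping to the image bounds, are assumptions made for illustration; the abstract does not define how the margin is parameterized, and `expand_box` is a hypothetical helper.

```python
def expand_box(box, margin, image_size):
    """Grow a bounding box by `margin` times its width and height on
    every side, then clip it to the image.

    box: (x0, y0, x1, y1) with x0 < x1 and y0 < y1, in pixels.
    image_size: (width, height) of the image.
    """
    x0, y0, x1, y1 = box
    width, height = image_size
    dx = margin * (x1 - x0)
    dy = margin * (y1 - y0)
    return (max(0, x0 - dx), max(0, y0 - dy),
            min(width, x1 + dx), min(height, y1 + dy))


# A hypothetical 20 x 40 face box inside a 100 x 100 image.
wider = expand_box((10, 10, 30, 50), margin=0.5, image_size=(100, 100))
```

A larger margin keeps more of the hair, neck, and background in the crop, and a margin of zero returns the detector's box unchanged.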
Information Potential Auto-Encoders
In this paper, we suggest a framework to make use of mutual information as a
regularization criterion to train Auto-Encoders (AEs). In the proposed
framework, AEs are regularized by minimization of the mutual information
between input and encoding variables of AEs during the training phase. In order
to estimate the entropy of the encoding variables and the mutual information,
we propose a non-parametric method. We also give an information theoretic view
of Variational AEs (VAEs), which suggests that VAEs can be considered as
parametric methods that estimate entropy. Experimental results show that the
proposed non-parametric models have more degrees of freedom in terms of
representation learning of features drawn from complex distributions such as
Mixture of Gaussians, compared to methods which estimate entropy using
parametric approaches, such as Variational AEs.
Comment: Information Theor
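One well-known non-parametric entropy estimator is the Parzen-window estimate of Renyi's quadratic entropy, H2 = -log V, where V is the "information potential": the mean of a Gaussian kernel of variance 2 * sigma^2 evaluated over all pairs of samples. The title suggests a connection, but it is an assumption that the paper uses exactly this estimator. The sketch handles one-dimensional samples only, and the kernel width `sigma` is a free parameter chosen for illustration.

```python
import math


def information_potential(samples, sigma):
    """Mean pairwise Gaussian kernel value over 1-D samples.

    The kernel has variance 2 * sigma**2, which arises from convolving
    two Gaussian Parzen windows of width sigma.
    """
    n = len(samples)
    var = 2.0 * sigma * sigma
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    total = sum(norm * math.exp(-(a - b) ** 2 / (2.0 * var))
                for a in samples for b in samples)
    return total / (n * n)


def renyi_quadratic_entropy(samples, sigma):
    """Estimate of Renyi's quadratic entropy, -log of the information potential."""
    return -math.log(information_potential(samples, sigma))


# Hypothetical encodings: tightly clustered versus widely spread.
h_tight = renyi_quadratic_entropy([0.0, 0.1, 0.2], sigma=1.0)
h_wide = renyi_quadratic_entropy([0.0, 5.0, 10.0], sigma=1.0)
```

Spread-out samples interact weakly through the kernel, so their information potential is lower and the entropy estimate is higher than for clustered samples.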